22. Video: Working With Outliers My Advice
Outliers Advice
Below are my guidelines for working with any column (random variable) in your dataset.
1. Plot your data to identify if you have outliers.
2. Handle outliers accordingly via the methods above.
3. If there are no outliers and your data follow a normal distribution, use the mean and standard deviation to describe your dataset, and report that the data are normally distributed.
Side note
If you aren't sure whether your data are normally distributed, plots called normal quantile plots and statistical tests such as the Kolmogorov-Smirnov test can help you decide. Implementing this test is beyond the scope of this class, but it is worth knowing that it exists.
4. If you have skewed data or outliers, use the five-number summary (minimum, first quartile, median, third quartile, maximum) to summarize your data, and report the outliers.
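The last guideline can be sketched in a few lines of Python. This is only an illustration, not the method from the lesson: the function names `five_number_summary` and `flag_outliers` and the `sample` data are made up for this example, and the 1.5 × IQR cutoff is one common convention for flagging outliers that I am assuming here. Quartiles can also be computed in several slightly different ways; this sketch uses the "inclusive" method from Python's standard `statistics` module.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # n=4 splits the data into quartiles and returns the three cut points.
    # method="inclusive" is an assumption; other quartile conventions
    # give slightly different Q1 and Q3 values.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


def flag_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    k=1.5 is a common rule of thumb (assumed here), not a fixed rule.
    """
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical column with one value far from the rest.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

print("Five-number summary:", five_number_summary(sample))
print("Outliers to report:", flag_outliers(sample))
```

If `flag_outliers` returns an empty list and the data look roughly normal in a plot, guideline 3 applies instead, and `statistics.mean` and `statistics.stdev` give the mean and standard deviation.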